-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add controller flag to turn off built-in resolution #4168
Conversation
The following is the coverage report on the affected files.
|
I originally tried to refactor the existing reconcilers to depend entirely on these new resolvers in #4108 but doing so was far too invasive and the PR was going to be just ridiculously huge. The main mistake I made was trying to give the resolver reconcilers total ownership of the resources before the taskrun/pipelinerun reconcilers saw them. I did this by using a filtering handler to effectively "hide" unresolved TaskRuns/PipelineRuns from the taskrun/pipelinerun reconcilers. But taking that approach meant duplicating massive chunks of the existing TaskRun and PipelineRun reconcilers in order to account for every single feature that occurs before or during resolution. For example: metrics, event emitting, cancellation handling, pending logic, timeouts, error messaging etc etc etc. All can interact with or be influenced by the resolution process in one way or another, so had to be kept in-tact when the resolver reconcilers were in control. So instead I'm taking a different tack: First I introduce the new resolution reconcilers to Pipelines behind an experimental controller arg and add a feature flag which disables the taskrun/pipelinerun resolution process. This means users who aren't interested in trying this stuff out won't be burdened by the changes. Second is a different architectural approach: now the taskrun and pipelinerun reconcilers don't ignore unresolved resources - they're processed as normal. Instead they simply stop reconciling them if they recognize that resolution hasn't happened yet. This means they still do stuff like setting and handling timeouts / cancellation / pending states. And the error messaging and events emitted remain entirely the same as before. The only difference is that resolution can now happen "out of band" and they'll wait for this to occur before they try and build DAGs or spin up pods. Seems to work pretty well 👍 |
/hold Let's figure out if there's agreement on tektoncd/community#493 before this gets reviewed or merged. |
The following is the coverage report on the affected files.
|
The following is the coverage report on the affected files.
|
The following is the coverage report on the affected files.
|
/test pull-tekton-pipeline-integration-tests |
/test pull-tekton-pipeline-alpha-integration-tests |
That darn |
/test pull-tekton-pipeline-alpha-integration-tests |
docs/developers/README.md
Outdated
`-experimental-enable-resolution-reconcilers` as an additional `arg` passed to | ||
the `tekton-pipelines-controller` in its [Deployment | ||
YAML](../../config/controller.yaml). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is probably a silly question but im wondering where the line is b/w command line args and configuration we configure via config maps - could it make sense that both of these options are configurable via config map?
(it seems like we're mostly just using args for images - not really 100% sure why those wouldn't be configurable via config map also)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh the code is right above - also threads-per-controller, disable-ha, namespace - maybe the idea is these CAN'T change at runtime? (unless they can)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say yes, can't change at runtime, and need a very very explicit change from the user
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two considerations:
- There's only one spot in our codebase where we can "spin up" reconcilers. That's in
cmd/controller/main.go
. At this early stage in the program's startup we don't have access to configmap values yet. - The reconcile loops (
ReconcileKind()
) of our reconcilers don't have direct access to the command line flags passed to the application. We have no existing avenue to pass those flags from the controller'smain.go
down to the reconcilers.
I'm happy to create a way for passing the values from CLI args down to reconcilers - is that what we're discussing here? I don't think we can spin up / tear down the resolution reconcilers via runtime configmap changes tho.
Edit: We could pass the values of a hypothetical -experimental-disable-in-tree-resolution
CLI flag via context.Context
to the reconcilers. This would probably be the way I'd implement it if that's the preferred route here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realise I didn't really address the original point - where the line is b/w command line args and configuration we configure via config maps. I didn't approach this with a clear line in mind. It was just easiest to plumb the configuration through from both of these sides to their respective points in the application. Open to changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no problem at all, thanks for the detailed explanation @sbwsg - it sounds like the line (at least for now) is configuration that impacts how the reconcile loops are started maybe? config for that needs to be passed via command line args, and the rest can be config map
so long story short sounds like you made the right choice to configure this via command line arg
my only remaining question is if we want both options (both the command line arg and the config map param) - i think im not focusing on the most important parts of this proposal so feel free to just make a choice and move on, but my thought is we could initially just have one way of configuring this (maybe the command line option) which both enables the experimental reconciler and disables the current resolution (esp since if this becomes our long term solution im guessing we'd rename or remove experimental-disable-ref-resolution
anyway) - but also ignore me if that's just more work to make happen than its worth
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nevermind, my bad, i just thought this through and realized of course you want 2 flags because you want to be able to:
- Turn off resolution and use the new reconcilers to resolve
- Turn off resolution but use some different reconciler to resolve (e.g. git)
👍 👍 👍 to your approach
Added to milestone 0.28 🤞 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a great incremental step forward!!
/approve
my last 2 cents is wondering at what point we want to switch to this being the default; right now if there are 2 paths that do the resolution this might present some challenges if folks try to maintain this and change one path and not the other
(maybe we can do something like: 1) after the next release, switch the end to end tests to use this second reconciler by default, 2) by the following release make this the default behavior and then 3) by the release after completely remove the old way of doing resolution?)
|
||
var _ controller.Reconciler = &TaskRunResolverReconciler{} | ||
|
||
func (r *TaskRunResolverReconciler) Reconcile(ctx context.Context, key string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe this is happening somewhere else but im wondering if these new reconcilers should check if experimentalDisableRefResolution
is set before they do anything?
im definitely over thinking it but im wondering what will happen if the controller is configured to start these but experimentalDisableRefResolution
is false - i.e. both the existing and the new reconcilers try to resolve the refs at the same time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great point. They'd collide spectacularly. I think for this reason it would be better to have two CLI flags and eliminate the ConfigMap entry. That way we can error out early during controller startup if the user incorrectly sets -experimental-enable-resolution-reconcilers
without switching off the existing resolution machinery.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: bobcatfish The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
The following is the coverage report on the affected files.
|
I've updated the PR to use two CLI arguments instead of a ConfigMap / CLI split. /hold cancel |
Generally multiple controllers updating the same resource's status is a "code smell", and it looks like this is doing a bunch of gymnastics to work around the reconciler framework, which I'm guessing is at least somewhat related. I have only paged in a little bit of the context on this, but would be happy to chat about what you are trying to do and brain storm paths forward for this that don't deviate from best practices. |
Hi @vdemeester @mattmoor I'd like to get this in for 0.28. I'm happy to drop the second reconciler from this PR completely and just leave the This gives us enough room to start trying out the alternatives from the remote resolution TEP in an experimental controller. |
Here's the additional context for this change (taken from TEP 0060):
|
I've reduced this PR to just the controller flag which disables built-in TaskRun and PipelineRun resolution. This gives an experimental controller just enough wiggle room to implement some of the alternatives described in TEP-0060 without racing against the PipelineRun and TaskRun reconcilers. |
The following is the coverage report on the affected files.
|
The following is the coverage report on the affected files.
|
cmd/controller/main.go
Outdated
@@ -110,8 +114,8 @@ func main() { | |||
|
|||
ctx = filteredinformerfactory.WithSelectors(ctx, v1beta1.ManagedByLabelKey) | |||
sharedmain.MainWithConfig(ctx, ControllerLogKey, cfg, | |||
taskrun.NewController(*namespace, images), | |||
pipelinerun.NewController(*namespace, images), | |||
taskrun.NewController(*namespace, images, *experimentalDisableInTreeResolution), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it's sort of a misnomer, but it feels like it'd be nice if this was passed through images
🤔
Since this is a breaking signature change anyways, I wonder whether it's worth renaming the struct and passing this field through it instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're missing a bit here - why's it preferable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a struct for each controller to pass the images and disableresolution flag through.
The following is the coverage report on the affected files.
|
We're currently trying to work out what the best way forward for TEP-0060 (remote resolution) is for Tekton Pipelines. As part of that we're creating a controller in the experimental repo that we can use to test out a bunch of the Alternatives from the TEP and figure out what works best in terms of implementation and DX. Right now the experimental controller would race with the Pipelines controllers since they're all going to try and "resolve" any taskRef in Taskruns and PipelineRuns. This commit adds a flag to the pipelines controller that explicitly switches off resolution performed by the taskrun and pipelinerun reconcilers. This gives us just enough room to perform resolution out-of-band and try out some of the alternatives. When the `-experimental-disable-in-tree-resolution` flag is set the taskrun and pipelinerun controllers will ignore taskruns and pipelineruns that don't have `status.taskSpec` or `status.pipelineSpec` populated.
The following is the coverage report on the affected files.
|
/test pull-tekton-pipeline-alpha-integration-tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
taskrunControllerConfig := taskrun.ControllerConfiguration{ | ||
Images: images, | ||
DisableTaskRefResolution: *experimentalDisableInTreeResolution, | ||
} | ||
|
||
pipelinerunControllerConfig := pipelinerun.ControllerConfiguration{ | ||
Images: images, | ||
DisablePipelineRefResolution: *experimentalDisableInTreeResolution, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems inconsistent with pipeline.Images
for these to specialize between the two controllers, since the pipeline run controller only uses a subset of the pipeline.Images
struct, but the same "options" struct is passed to both controllers.
My personal preference would be to rename/generalize the Images
struct to reflect a more complete set of options, but this seems nitpicky and I have some other changes I've been thinking about to some of this handling, so perhaps we land this and I can try to find something we're happy with.
thanks for the change. 👍
} | ||
|
||
var ( | ||
// Check that our Reconciler implements pipelinerunreconciler.Interface | ||
_ pipelinerunreconciler.Interface = (*Reconciler)(nil) | ||
|
||
// Indicates pipelinerun resolution hasn't occurred yet. | ||
errResourceNotResolved = fmt.Errorf("pipeline ref has not been resolved") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: errors.New
when there's no formatting.
🤔 I thought this was something golangci-lint would complain about, sorry I didn't notice before.
Also cc @n3wscott in case there's something useful in the controller framework we can use here to simplify any of the logic. None of the things that come to mind for me seem quite right.
Changes
We're currently trying to work out what the best way forward for TEP-0060
(remote resolution) is for Tekton Pipelines. As part of that we're
creating a controller in the experimental repo that we can use to test out
a bunch of the Alternatives from the TEP and figure out what works best
in terms of implementation and DX.
Right now the experimental controller would race with the Pipelines controllers
since they're all going to try and "resolve" any taskRef in Taskruns and
PipelineRuns.
This commit adds a flag to the pipelines controller that explicitly switches
off resolution performed by the taskrun and pipelinerun reconcilers. This gives
us just enough room to perform resolution out-of-band and try out some of the
alternatives.
When the
-experimental-disable-in-tree-resolution
flag is set the taskrunand pipelinerun controllers will ignore taskruns and pipelineruns that don't
have
status.taskSpec
orstatus.pipelineSpec
populated.Submitter Checklist
As the author of this PR, please check off the items in this checklist:
functionality, content, code)
Release Notes